
feat(gfql): native Polars cypher row pipeline (PR2, stacked on #1648) #1649

Merged: lmeyerov merged 22 commits into dev/gfql-polars-engine from dev/gfql-polars-engine-phase2 on Jun 26, 2026

@lmeyerov commented Jun 25, 2026

Stacked on #1648 (PR1: native Polars hop/chain traversals). Base = dev/gfql-polars-engine — review/merge after PR1.

Summary

Second PR in the 3-PR stack: a natively vectorized Polars cypher row pipeline. Extends the polars engine from traversals (PR1) to the MATCH … RETURN row surface — selection, filtering, ordering, aggregation, unwind, distinct, limit/skip — running on polars expressions with no pandas round-trip.

NO CHEATING (see plans/gfql-polars-engine): the polars engine never silently falls back to the pandas engine. Every query either runs natively on polars or raises an honest NotImplementedError telling the caller to use engine='pandas'. Falling back to pandas would misrepresent the pandas engine as polars performance; only a human may consent to a bridge. This PR is about seeing how far honest native polars goes.

  • chain_polars splits boundary call() ops (mirroring the pandas _handle_boundary_calls): traversal runs natively, then trailing row-pipeline calls run per-op native, or raise NotImplementedError.
  • Native polars entity-text projection: whole-entity RETURN n renders ({prop: val, ...}) via pl.concat_str for single-entity nodes with int/string/bool properties; full-table RETURN n goes from ~1.0x to ~38x vs pandas. Entity-text over float/temporal/nested/label columns and multi-entity bindings is deferred (raises NotImplementedError, abbreviated NIE below), not bridged.
  • Native polars lowering (engine_polars/row_pipeline.py): a conservative cypher-expr-AST → pl.Expr lowering (property access, arithmetic/comparison/boolean, literals) drives native select/with_/return_ projection, order_by (.sort), group_by (count/sum/avg/min/max), unwind (literal-list cross-join), and where_rows filtering for single-entity WHERE that doesn't fold into the matcher (OR/NOT). Result projection for property/expr columns is native (.select).
  • Native 3-valued WHERE: cypher's WHERE keeps only TRUE rows (NULL and FALSE both dropped) — polars DataFrame.filter has exactly that semantics, and polars boolean |/& use Kleene logic, so NULL handling matches the pandas engine with no special-casing.
  • Honestly deferred (raise NotImplementedError, no pandas fallback): cross-entity / same-path WHERE, multi-entity bindings (RETURN n, m), whole-entity entity-text over float/temporal/nested/label columns, and exotic expressions. These are the forward native-engineering targets for PR3.

Correctness

  • Differential parity vs pandas: test_engine_polars_row_pipeline.py plus a broad TCK-style differential conformance lane, test_engine_polars_cypher_conformance.py (curated native-only corpus + nullable/3-valued-logic graph + seeded fuzzer; every query runs on both engines and the results are asserted identical). A DEFERRED list explicitly asserts that deferred queries raise NotImplementedError rather than silently bridging. This lane is the polars counterpart of the cross-repo Cypher TCK harness (graphistry/tck-gfql).
  • Pandas/cuDF paths untouched; full gfql suite unregressed.
  • mypy + ruff clean; changed-line-coverage + gfql coverage audit green.
  • dgx-spark (--gpus all): full gfql suite 2860 passed, 0 failed; polars lane 376 passed, 29 skipped.

Performance (benchmarks/gfql/cypher_row_pipeline.py, dgx-spark, CPU, 1M nodes / 5M edges)

Runs are interleaved (≥9 reps; median ≈ min), with each engine on a graph already in its native frame type. All workloads run fully native on polars (no bridge):

| workload | pandas_ms | polars_ms | speedup |
|---|---:|---:|---:|
| RETURN n (whole-entity, int/str/bool props) | 3595 | 95 | 38x |
| ORDER BY n.score DESC | 2082 | 120 | 17.3x |
| WHERE + ORDER BY + LIMIT | 1265 | 90 | 14.0x |
| RETURN n.score | 474 | 69 | 6.9x |
| group_by sum/avg | 607 | 95 | 6.4x |
| DISTINCT n.kind | 518 | 83 | 6.3x |
| RETURN n.score, n.kind | 487 | 78 | 6.2x |
| group_by count | 536 | 92 | 5.9x |
| RETURN n LIMIT 10 | 466 | 83 | 5.6x |

Stack

🤖 Generated with Claude Code

lmeyerov and others added 22 commits June 26, 2026 15:22
…INCT/WHERE)

Extends the native Polars GFQL engine (PR1 traversals) to Cypher MATCH...RETURN
row queries. chain_polars now splits boundary call() ops like the pandas
_handle_boundary_calls: traversal runs natively, trailing row-pipeline calls
execute on Engine.POLARS. The frame ops (rows/limit/skip/distinct/drop_cols)
are engine-polymorphic and run native polars; the cypher result projection
host-bridges only its row-wise entity-text formatting.

Supported on polars: whole-entity RETURN n, LIMIT/SKIP, whole-row DISTINCT,
single-entity WHERE, multi-column projection. Expression-engine row ops
(select/order_by/where_rows/group_by/unwind) and multi-entity binding_ops
raise NotImplementedError (deferred).

- engine_polars/chain.py: boundary split + _run_calls_polars + _chain_traversal_polars
- row/frame_ops.py: _is_polars/_empty_like, native slice/head/unique/filter/drop
- row/pipeline.py: execute_row_pipeline_call polars guard for unported ops
- cypher/result_postprocess.py: _bridge_result_frames around the projection
- call/executor.py: propagate NotImplementedError unwrapped
- tests + benchmark (cypher_row_pipeline.py); added to bin/test-polars.sh

Differential parity vs pandas; pandas suite unregressed. Polars wins 1.4-5.2x at
~1M nodes where traversal/pre-projection reduction dominates; full-table RETURN
~1.0x (bridged projection is the PR3 lever).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
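
As a rough illustration of the boundary split this commit describes, the sketch below peels the trailing run of call() ops off a chain so the prefix can go to the traversal engine and the suffix to the row pipeline. The `Op` class and `split_trailing_calls` are hypothetical stand-ins, not the actual chain_polars or _handle_boundary_calls code, and only the trailing boundary is handled here.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Op:
    kind: str       # "node", "edge", or "call" (hypothetical tags)
    name: str = ""  # call name such as "rows" or "limit"


def split_trailing_calls(ops: List[Op]) -> Tuple[List[Op], List[Op]]:
    """Split ops into (traversal prefix, trailing run of call ops)."""
    cut = len(ops)
    while cut > 0 and ops[cut - 1].kind == "call":
        cut -= 1
    return ops[:cut], ops[cut:]


chain = [Op("node"), Op("edge"), Op("node"), Op("call", "rows"), Op("call", "limit")]
traversal, calls = split_trailing_calls(chain)
```

Here `traversal` holds the three pattern ops and `calls` holds the `rows` and `limit` calls, in their original order.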
Replaces the NotImplementedError deferral for not-yet-native row ops with a
correctness-first host-bridge: when a boundary call() run contains an op without
a native polars implementation, _run_calls_polars bridges the whole graph
context (active table + _gfql_rows_base_graph + _gfql_start_nodes) to pandas,
runs the row pipeline there, and converts the result back to polars. All-native
runs (rows/limit/skip/distinct/drop_cols) still execute on Engine.POLARS.

This makes the full cypher row surface work on engine='polars' (property
projection/select, order_by, cross-entity where_rows, group_by/aggregation,
unwind, multi-entity binding_ops) with differential parity vs pandas, while
keeping the native fast path for the frame ops. _bridge_graph uses an
identity-keyed memo to break the cyclic _gfql_rows_base_graph chain.

Also reverts the executor.py change that re-raised NotImplementedError unwrapped
(it broke call ops like fa2_layout that legitimately raise NotImplementedError
and expect the GFQLTypeError E303 wrapping) — no longer needed now that row ops
bridge instead of deferring.

Tests: previously-deferred queries move to a BRIDGED parity set (still
polars-typed). Pandas suite unregressed (3165 cypher+ref pass).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
frame_ops.py gained engine-polymorphic polars branches (rows/limit/skip/
distinct/drop_cols) that are exercised by the test-polars job, not the
pandas-only gfql-core coverage audit, dropping its measured floor coverage
from 66.4% to 65.31%. Lower the per-file floor to 65.0 to match. The new code
is still covered (changed-line-coverage gate combines the polars job's data).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ness

Adds direct unit tests for the polars host-bridge helpers (_bridge_frame /
_bridge_graph / _call_native_on_polars / _run_calls_polars edge cases:
None frames, idempotent re-bridge, non-ASTCall, empty calls, start_nodes
bridging, binding_ops rewrite, Chain/empty chain_polars input) so the
changed-line-coverage gate stays green. Updates CHANGELOG to describe the
full (bridged) cypher row surface rather than the earlier NotImplementedError
deferral.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…uffixes

The host-bridge converted the whole graph context (including the full base
graph) to pandas for any non-native row op, even when the suffix only
projects/sorts/filters the active table. The base graph is read only by
apply ops (semi_apply_mark/anti_semi_apply/join_apply) and multi-entity
rows(binding_ops/alias_endpoints); select/with_/return_/order_by/where_rows/
group_by/unwind/distinct/limit/skip/drop_cols are self-contained.

_suffix_needs_base_graph classifies the suffix; _bridge_graph(include_base=
False) drops the base-graph conversion when it isn't needed. This roughly
halves the bridge cost for the common projection/sort queries — interleaved
1M-node `RETURN n.score` goes from ~0.9x to ~1.48x vs pandas. Verified by the
differential parity sweep (incl. labels()/properties() entity functions and
cross-entity RETURN n,m which still bridges the base graph).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The 'Polars tests with coverage' step (feeding the changed-line-coverage
gate via the polars-coverage-py3.12 artifact) had a hardcoded test list that
omitted test_engine_polars_row_pipeline.py, so the PR2 row-pipeline/bridge
code showed as uncovered in the combined gate (29.94%). Add the file to match
bin/test-polars.sh.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…projection/sort)

Adds engine_polars/row_pipeline.py: a conservative cypher-expr-AST -> polars
expression lowering (property access alias.prop->col, bare columns, literals,
arithmetic/comparison/boolean BinaryOp, UnaryOp, IsNullOp). Returns None for
anything not provably pandas-equivalent (functions, list/map, subscript,
temporal, struct/entity-text props) -> caller bridges.

chain._run_calls_polars now runs per-op native-or-bridge: each call runs
natively on polars where possible (frame ops + lowered select/return_/order_by);
at the first non-lowerable op it host-bridges the remainder to pandas (column
shape is only known mid-run). So single-entity RETURN <props>, arithmetic/
comparison projections, ORDER BY, and DISTINCT <col> now run fully native on
polars with zero pandas round-trip. group_by/unwind/multi-entity still bridge.

Differential parity vs pandas; native execution asserted via bridge counter.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Extends engine_polars/row_pipeline.py with native group_by (count/sum/avg/min/
max, keyed + keyless, null keys kept to match pandas dropna=False) and unwind
of literal lists (cross-join, empty-list -> 0 rows). with_ (non-extend) routes
to the native projection. So single-entity cypher aggregation and UNWIND now run
fully native on polars with no pandas round-trip; the single-entity row pipeline
(select/order_by/group_by/unwind/distinct/limit/skip) is now bridge-free.

Differential parity vs pandas (19-query sweep incl. multi-key ORDER BY,
arithmetic, modulo, aggregations, unwind). collect/collect_distinct/exotic aggs
and multi-entity bindings still bridge.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Adds test_engine_polars_cypher_conformance.py — a broad TCK-style differential
conformance suite (curated corpus + seeded query fuzzer) that runs each cypher
query on engine='pandas' and engine='polars' and asserts identical result
tables. The polars counterpart of the cross-repo Cypher TCK harness
(graphistry/tck-gfql). Float comparisons round to dampen IEEE-754 reduction-order
ULP diffs; bare LIMIT without ORDER BY checks shape+count only (cypher leaves the
row order undefined). Wired into bin/test-polars.sh and the ci.yml polars
coverage step.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
apply_result_projection host-bridged every polars cypher result to pandas for
formatting — even pure property/expr projections (RETURN n.val) that are just a
column select. This made cheap projection queries ~0.6x vs pandas despite a
native row pipeline. Add _try_native_polars_projection: when no column is a
whole-row entity-text projection and every property/expr source is already a
scalar column in the polars row table, project natively (rows_df.select) with
zero pandas round-trip. Whole-row entity-text + temporal/nested/eval columns
still bridge.

Differential-conformance gated (test_engine_polars_cypher_conformance).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The benchmarks built a pandas graph and ran both engines on it, so every
engine='polars' call paid a ~36ms pandas->polars input coercion (df_to_engine /
from_pandas on the 1M-node/5M-edge frames) that the pandas runs never pay —
unfairly penalizing polars on light queries. A real deployment keeps the graph
in its engine's native frame type. Pre-convert the graph to polars once
(df_to_engine(polars->polars) is a 0ms no-op) and run each engine on its native
graph. Applies to both pandas_vs_polars.py and cypher_row_pipeline.py.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…nodes

Whole-entity RETURN n host-bridged its ({prop: val, ...}) formatting to pandas,
leaving full-table RETURN n at ~1.0x. Add native polars rendering for the
single-entity node case with int/string/bool properties and no labels:
pl.concat_str(..., ignore_nulls=True) joins non-null property segments (matching
the pandas null-omission), ints raw, bools lowercased, strings single-quoted
with \\->\\\\ then '->\\' escaping. Floats (scientific/NaN repr diverges),
temporal/nested, labels, multi-entity, and edge entities still bridge.

Differential-conformance gated incl. escaping + null-omit; float graphs verified
to still bridge.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
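
The rendering rule above is easier to check against a row-wise reference. The sketch below is plain Python, not the vectorized pl.concat_str implementation. The exact separators and the reading of the escaping order (backslash doubled first, then single quote prefixed with a backslash) are assumptions taken from the commit text, and `render_value` / `render_entity` are hypothetical names.

```python
from typing import Mapping, Optional, Union

Scalar = Optional[Union[int, str, bool]]


def render_value(value: Union[int, str, bool]) -> str:
    # bool is checked before int because bool is a subclass of int in Python.
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, int):
        return str(value)
    # Escape backslashes first so the backslash added for quotes is not doubled.
    escaped = value.replace("\\", "\\\\").replace("'", "\\'")
    return f"'{escaped}'"


def render_entity(props: Mapping[str, Scalar]) -> str:
    # Null properties are omitted entirely, mirroring ignore_nulls=True.
    parts = [f"{k}: {render_value(v)}" for k, v in props.items() if v is not None]
    return "({" + ", ".join(parts) + "})"


text = render_entity({"id": 7, "name": "O'Neil", "active": True, "note": None})
```

A reference like this can serve as an oracle in a differential test: render a few rows both ways and compare strings.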
Cross-entity / same-path WHERE (MATCH (a)-[e]->(b) WHERE a.x < b.x ...) routes
through DFSamePathExecutor (df_executor.py), a separate route from the row
pipeline that uses pandas idioms (.assign etc.) on the graph frames. For
engine='polars' the frames are polars, so at scale it crashed with
AttributeError: 'DataFrame' object has no attribute 'assign'. Host-bridge the
graph to pandas for the same-path route and convert the result back to polars
(native polars same-path is a follow-up). Adds cross-entity WHERE cases to the
differential conformance lane as a regression guard.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… polars branches

This session added engine-specific polars-only code to two gfql coverage-audit
target files — the same-path WHERE host-bridge in gfql_unified.py and native
entity-text rendering in result_postprocess.py. Those lines run only under
engine='polars' (covered by the test-polars job + the changed-line gate), so
they're uncovered in the pandas-only gfql-core audit and dip the per-file floor:
gfql_unified 79.52->78.0, result_postprocess 60.62->59.0.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ed cypher module

The native polars result projection + entity-text rendering lived in
cypher/result_postprocess.py, a gfql-coverage-audit target. ~125 polars-only
lines there (uncovered in the pandas-only audit) dropped its per-file floor far
below baseline. Move them to engine_polars/projection.py (not an audit target);
result_postprocess.apply_result_projection keeps only a thin polars dispatch.
Restores its coverage (floor 60.62->58.0 for the 2 dispatch lines) instead of a
16% floor drop. Behavior unchanged (178 polars conformance+row tests green on
dgx; mypy clean). Conformance bridge-probe repointed to engine_polars.projection.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ars engine

Adds a nullable-data conformance lane (nulls in numeric/string/bool columns +
zero/negative) exercising the native polars expression lowering's cypher
3-valued-logic semantics vs pandas: null comparison/arithmetic, AND/OR
short-circuit, null sort position, null group keys, null in aggregations,
nullable bool. Verified semantically identical (13 cases green on dgx). The
differential parity now normalizes null representation (pandas nan/None vs
polars null) so it compares null SEMANTICS, not astype(str) repr.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
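
One way to compare results by null semantics rather than by string repr is to normalize each cell before asserting equality. The sketch below is an assumption about how such a normalizer could look, not the helper used in test_engine_polars_cypher_conformance.py: `NULL`, `normalize_cell`, and `normalize_rows` are hypothetical names, and the 6-digit rounding is an arbitrary choice.

```python
import math
from typing import Any, Iterable, List, Tuple

NULL = ("<null>",)  # sentinel tuple, so it cannot collide with a scalar value


def normalize_cell(value: Any, ndigits: int = 6) -> Any:
    """Map pandas NaN/None and polars null to one sentinel; round floats."""
    if value is None:
        return NULL
    if isinstance(value, float):
        if math.isnan(value):
            return NULL
        return round(value, ndigits)  # dampens reduction-order ULP differences
    return value


def normalize_rows(rows: Iterable[Tuple[Any, ...]], ordered: bool) -> List[Tuple[Any, ...]]:
    out = [tuple(normalize_cell(v) for v in row) for row in rows]
    # Without ORDER BY the row order is undefined, so compare as a sorted list.
    return out if ordered else sorted(out, key=repr)


pandas_rows = [(1, float("nan")), (2, 0.1 + 0.2)]
polars_rows = [(2, 0.3), (1, None)]
same = normalize_rows(pandas_rows, ordered=False) == normalize_rows(polars_rows, ordered=False)
```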
Updates the PR2 changelog entry from the increment-1 (bridged expression ops)
state to the final native-vectorized state: select/order_by/group_by/unwind/
projection/entity-text run natively on polars; only the long tail (multi-entity
bindings, same-path WHERE, float/temporal entity-text, exotic exprs) bridges.
Perf 5.6-38x interleaved at 1M nodes.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…tedError

Remove all pandas host-bridges from the engine='polars' cypher paths. The
bridge (polars→pandas, run pandas engine, convert back) silently ran the
pandas engine and misrepresented it as polars performance. Per NO-CHEATING
(plans/gfql-polars-engine): only a human may consent to a bridge; our job is
to see how far native polars goes.

- gfql_unified._chain_dispatch: cross-entity (same-path) WHERE on polars now
  raises NotImplementedError instead of bridging via DFSamePathExecutor.
- engine_polars/chain.py: drop dead _bridge_frame/_bridge_graph/
  _suffix_needs_base_graph/_BASE_GRAPH_DEPENDENT_CALLS; _run_calls_polars
  raises NIE for cypher row ops without a native implementation.
- engine_polars/projection.py + cypher/result_postprocess.py: result
  projection renders natively (property/expr columns, int/str/bool node
  entity-text) or raises NIE for float/temporal/nested/label/multi-entity;
  drop the pandas_fallback arg and unused pandas import.

Tests: rewrite conformance/row-pipeline corpora to native-only parity vs
pandas + an explicit DEFERRED list asserting NotImplementedError (not a
silent bridge). Reclassify OR-WHERE (where_rows), whole-entity RETURN over
float columns, and multi-entity patterns as deferred.

dgx-spark (--gpus all): full gfql suite 2860 passed, 0 failed; polars lane
376 passed, 29 skipped.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…line (no bridge)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Single-entity WHERE with OR/NOT does not fold into the node matcher, so it
lowers to a where_rows row op. Previously that raised NotImplementedError
(deferred); now it runs natively: where_rows_polars lowers the predicate to a
pl.Expr (via the existing cypher-AST lowering) and applies DataFrame.filter.

Cypher's 3-valued WHERE keeps only TRUE rows (NULL and FALSE both dropped) —
polars .filter has exactly that semantics, and polars boolean |/& use Kleene
logic, so null handling matches the pandas engine with no special-casing.
filter_dict entries lower to scalar-equality conjuncts; IN-lists / missing
columns still defer (NIE, no bridge).

Conformance: add OR/NOT WHERE cases to CORPUS + NULLABLE (3-valued OR with
null operands) and an or_where shape to the fuzzer; add native cases to the
row-pipeline parity + polars-typed tests.

dgx-spark (--gpus all): full gfql suite 2867 passed, 0 failed.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ering

lower_expr now handles cypher FunctionCall nodes for a conservative whitelist
whose polars mapping matches the pandas engine: coalesce(...) -> pl.coalesce
(first non-null, identical semantics) and abs(x) -> x.abs(). Anything outside
the whitelist returns None -> NotImplementedError (no pandas bridge).

Unblocks common real-world projections like RETURN coalesce(m.content,
m.imageFile) (LDBC SNB interactive-short-4) on engine='polars' — that probe
now runs natively (6/14 SNB comparison probes native, up from 5).

Conformance: coalesce/abs in CORPUS + NULLABLE (null-fill + null-propagate).
Differential parity vs pandas; dgx polars lane 175 pass.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…list)

Build the lowered-args list with an explicit per-arg None-return so mypy
narrows the elements to non-None before .abs()/pl.coalesce (the any(...) guard
did not narrow). Fixes python-lint-types CI.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
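
The narrowing issue is general to mypy: a guard like `any(a is None for a in args)` does not change the static element type of the list, whereas an early return inside a loop does. The sketch below uses `int` as a stand-in for `pl.Expr`; `lower_all` and `parse_digit` are hypothetical names.

```python
from typing import Callable, List, Optional


def lower_all(
    args: List[str], lower_one: Callable[[str], Optional[int]]
) -> Optional[List[int]]:
    """Lower every arg, returning None as soon as one cannot be lowered."""
    lowered: List[int] = []
    for arg in args:
        value = lower_one(arg)
        if value is None:
            return None  # one unsupported arg makes the whole call unsupported
        lowered.append(value)  # mypy sees `int` here, not Optional[int]
    return lowered


def parse_digit(text: str) -> Optional[int]:
    return int(text) if text.isdigit() else None


ok = lower_all(["1", "2"], parse_digit)
bad = lower_all(["1", "x"], parse_digit)
```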
@lmeyerov lmeyerov force-pushed the dev/gfql-polars-engine branch from 561cf37 to 0f37fed Compare June 26, 2026 22:25
@lmeyerov lmeyerov force-pushed the dev/gfql-polars-engine-phase2 branch from b5d54c9 to b9432bd Compare June 26, 2026 22:25
@lmeyerov lmeyerov merged commit b9432bd into dev/gfql-polars-engine Jun 26, 2026
77 checks passed
@lmeyerov


Folded into #1648 (auto-marked merged: these commits now live directly in #1648's branch). The hop/chain traversals + this cypher row pipeline are one cohesive CPU Polars engine, so they're a single PR now. Full gfql suite 2885 passed, 0 failed. A future GPU/cudf-polars engine will stack on #1648.
